APPELLANT'S BRIEF ON APPEAL

Appellants hereby appeal to the Board of Patent Appeals and Interferences 
from the Examiner's Final Rejection of Claims 1-6, 8, and 10-12, which was 
contained in the Office Action mailed September 11, 2008.

A timely Notice of Appeal was filed March 10, 2009. 

Real Party In Interest 

The real party in interest is Eastman Kodak Company, assignee of the 
inventors' entire interest. 

Related Appeals And Interferences 

No appeals or interferences are known which will directly affect or be 
directly affected by or have bearing on the Board's decision in the pending appeal. 

Status Of The Claims 

Claims 1-6, 8, and 10-12 stand finally rejected and are the subject of this
appeal.

Claims 7 and 9 stand cancelled without prejudice or disclaimer of the 
subject matter presented therein due to the Amendment filed April 27, 2007, 
which was resubmitted on May 16, 2007 in response to a Notice of Non- 
Compliant Amendment dated May 7, 2007. 

Appendix I provides a clean, double spaced copy of the claims on appeal. 

Status Of Amendments 

An Amendment After Final was filed on May 11, 2009. An Advisory
Action dated May 21, 2009, indicated that such Amendment After Final would not
be entered for purposes of this Appeal and that the rejection of the claims would 
be maintained. 

Summary of Claimed Subject Matter 

Independent Claim 1 reads as follows. Example citations of support in the
description for Claim 1's limitations are shown below in bold and in parentheses.
Such citations are not intended to be exhaustive.



Claim 1 requires a method for improving scene classification of a
sequence of digital images. The method includes (a) providing a sequence of
images captured in temporal succession (page 5, lines 28-29; FIG. 1, reference 
numeral 10; and page 6, lines 30-32), at least two pairs of consecutive images 
(page 6, lines 4-5) in the sequence of images having different elapsed times 
between their capture (page 6, lines 3-6; page 12, line 15 to page 13, line 3; and 
page 14, lines 3-14, including Table 4). The method also includes (b) classifying 
each of the images individually based on information contained in the individual 
image (page 5, lines 29-32; and FIG. 1, reference numeral 20) to generate an 
initial content-based image classification for each of the images (page 5, line 32 
to page 6, line 2; FIG. 1, reference numerals 30 and 40). In addition, the 
method includes (c) generating a final image classification (page 6, lines 8-11;
FIG. 1, reference numeral 90) for each image based at least on the respective
initial content-based image classification and a pre-determined temporal context
model (page 6, lines 2-3; FIG. 1, reference numerals 60 and 80) that considers
at least the temporal succession of the sequence of images (page 5, lines 15-19). 
And, the method includes (d) storing the final image classifications in a computer 
readable storage medium (page 5, lines 2-9). 
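
By way of illustration only, and not as a characterization of the scope of
Claim 1 or of any embodiment in the specification, the relationship between
steps (b) and (c) may be sketched as follows. The sketch assumes a two-class
problem, stands in for step (b) with hand-written per-image beliefs, and uses a
Viterbi pass (compare Claim 10) over a hypothetical transition model that
depends on elapsed time (compare Claims 5 and 12). The class labels, the
function names, the decay constant, and all probabilities are assumptions made
for the example and do not come from the record.

```python
import math

CLASSES = ("indoor", "outdoor")  # hypothetical scene labels


def transition(elapsed_s, same_class, tau=600.0):
    """Hypothetical temporal context model: the shorter the elapsed time
    between two consecutive images, the likelier they share a class."""
    stay = 0.5 + 0.45 * math.exp(-elapsed_s / tau)
    return stay if same_class else 1.0 - stay


def final_classification(initial_beliefs, capture_times):
    """Combine per-image (content-based) beliefs with the temporal model
    using the Viterbi algorithm; returns one final label per image."""
    # score[-1][c]: best log-score of any label path ending in class c
    score = [{c: math.log(initial_beliefs[0][c]) for c in CLASSES}]
    back = []  # back[i-1][c]: best previous class when image i is labeled c
    for i in range(1, len(initial_beliefs)):
        dt = capture_times[i] - capture_times[i - 1]
        row, ptr = {}, {}
        for c in CLASSES:
            prev = max(
                CLASSES,
                key=lambda p: score[-1][p] + math.log(transition(dt, p == c)),
            )
            row[c] = (
                score[-1][prev]
                + math.log(transition(dt, prev == c))
                + math.log(initial_beliefs[i][c])
            )
            ptr[c] = prev
        score.append(row)
        back.append(ptr)
    # Trace the best path backwards from the best final class.
    path = [max(CLASSES, key=lambda c: score[-1][c])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Stand-in for step (b): initial content-based beliefs for three images.
beliefs = [
    {"indoor": 0.9, "outdoor": 0.1},
    {"indoor": 0.4, "outdoor": 0.6},
    {"indoor": 0.8, "outdoor": 0.2},
]
print(final_classification(beliefs, [0, 10, 20]))        # seconds apart
print(final_classification(beliefs, [0, 36000, 72000]))  # hours apart
```

In this sketch, the same initial beliefs yield different final labels for the
middle image depending on the elapsed times: with images captured seconds
apart it is relabeled "indoor" to agree with its neighbors, whereas with images
captured hours apart its initial "outdoor" label is retained.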

Grounds of Rejection to be Reviewed on Appeal 

The following issues are presented for review by the Board of Patent 
Appeals and Interferences: 

1. the propriety of the rejection of Claims 1-2, 4-5, and 12 under 35
U.S.C. 103(a) in view of "A Recurrent Neural Network Classifier for
Improved Retrievals of Areal Extent of Snow Cover", Simpson et al.,
IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No.
10, October 2001, pages 2135-2147 (hereinafter the "Simpson
Article") and "Automatic Image Event Segmentation and Quality
Screening for Albuming Applications", Loui et al., IEEE Int'l
Conference on Multimedia and Expo, July 2000, New York City, New
York (hereinafter the "Loui Article");

2. the propriety of the rejection of Claim 3 under 35 U.S.C. 103(a) in
view of the Simpson Article, the Loui Article, and U.S. Patent No. 
6,977,679 (the "Tretter Patent"); and 

3. the propriety of the rejection of Claims 6, 8, and 10-11 under 35 U.S.C.
103(a) in view of the Simpson Article, the Loui Article, and 
"Integration of Multimodal Features for Video Scene Classification 
based on HMM", Huang et al., IEEE Workshop on Multimedia Signal 
Processing, 1999, pages 53-58 (hereinafter the "Huang Article"). 

Arguments 

Independent Claim 1, the only independent claim, stands rejected
under 35 U.S.C. § 103(a) as allegedly unpatentable over the Simpson Article
modified by the Loui Article. See pages 6-9 of the Final Office Action. 
Appellants respectfully request reversal of this rejection for at least the following 
reasons. 

Claim 1 requires, among other things, two sequential image 
classification processes for each image in a sequence of images captured in 
temporal succession. The first is an initial content-based image classification, 
which is based on information contained in the individual image. The second is a 
final image classification, which is based on both the respective initial content- 
based image classification and a pre-determined temporal context model that 
considers the temporal succession of the sequence of images. 

It appears that Appellants and the Examiner agree that the Simpson 
Article does not teach or suggest the claimed two sequential classification 
processes. See page 8 of the Final Office Action, first full paragraph, which
states that "Simpson fails to specifically suggest . . . generating a final image
classification for each image based at least on the respective initial content-based
image classification and a predetermined temporal context model . . . ." In
particular, the Examiner notes that the Simpson Article discloses two
classification processes, FFNN and RNNCCS. See the Final Office Action, page
6, paragraph #7, lines 7-9 ("Simpson discloses a method/single image
classification using feed-forward neural networks (FFNN) and image sequence
classification using recurrent neural networks (RNNCCS) . . . .").

The FFNN process is understood to classify pixels of isolated
scenes, such as the pixels of a single satellite image, without regard to other
images. See the Simpson Article, Section III-B-(1), first paragraph ("At
midlatitudes, data from a given AVHRR instrument typically are received only
once per daylight period. The . . . [FFNN] network was designed to classify
isolated scenes such as these."). See also the Simpson Article, Section III-B-(1),
third paragraph ("Each output corresponds to a class of data (X = clear, Y = cloud,
Z = snow) and is trained to return a value between 0 and 1 . . . . An output node 
returns zero when the network is maximally certain that a given pixel does not 
correspond to its class . . ..") (bold added for emphasis). See also the Simpson 
Article, FIG. 3(a) with the title "No Feedback" (indicating that feedback for pixel 
values from a previous image is not used in the FFNN process). 

On the other hand, the RNNCCS process is understood to classify 
pixels of images at least in part based on the value of corresponding pixels in a 
previous image. See the Simpson Article, Section III-B-(2), first paragraph 
("GOES provides about twelve daylight scenes per day .... When such data are 
available, .... (RNNCCS) improves classification skill. The RNNCCS uses 
spectral and texture information from the current image in the time series as input,
along with data from the previous texture input and the previous network output 
(Fig. 3(b)). This allows the network to have an operational 'short term memory' 
of both texture and classification data from the previous image."). See also the 
Simpson Article, Section III-B-(2), second paragraph ("If, for example, the 
previous RNNCCS classification assigned pixel P as snow and the texture data for 
pixel P remains close to its previous value, then the RNNCCS network knows that 
pixel P probably still is snow.").

Accordingly, the Simpson Article is understood to teach that if 
only an isolated image is available, the FFNN process is used to classify each 
pixel of that image. However, if a series of images is available, the pixels of the 
images are classified using the RNNCCS process, because it improves 
classification skill. See the Simpson Article, Section III-B-(2), first paragraph. 

Because the Simpson Article does not teach or suggest that the
FFNN and RNNCCS processes are performed on an image in sequence, one
followed by the other, as would be required by Claim 1, the Examiner refers to the
Loui Article.

Unlike the Simpson Article, which classifies individual pixels in an 
image, the Loui Article groups images based on events (e.g., trip to New York 
City). See the Loui Article, Abstract ("A novel event segmentation algorithm was 
created to automatically cluster pictures into events and sub-events for albuming, 
based on date/time metadata as well as color content of the pictures.") 

As best shown by the steps at the top-left of page 2 ("the following
steps are carried out . . .") and FIG. 1, the Loui Article is understood to teach
running a first-level date/time clustering algorithm to determine event boundaries.
See Step 1 and the "1st level" box in FIG. 1. For example, pictures captured on
one day may be grouped into one event and pictures on the next day may be
grouped into a separate event. See Section IV - Results and Discussion. After the
date/time event clustering, a second-level clustering is performed by analyzing 
image content (color similarity/block-based histogram algorithm). The second- 
level clustering is used to verify the event boundaries (generated from the first- 
level clustering) and to ensure that each cluster is composed of several groups of 
pictures. See Steps 2 and 3 and the "2nd level" box in FIG. 1. For example, the
color characteristics of the last picture taken on the first day may be compared to 
the color characteristics of the first picture taken on the second day to make sure 
the two days represent different events. See Section II.B - Block-Based 
Histogram Correlation, first paragraph ("If the average intersection is below a low 
threshold we can say that the two pictures are sufficiently different and may not be 
part of the same event."). After the analysis of image content to verify event 
boundaries and to ensure that each cluster has a proper number of groups of 
pictures, some of the groups are merged if the date/time separations between the 
groups are not meaningful. See step 4. At step 5, subject arrangement within an 
event is checked to group similar pictures together. Finally, at step 6, refinement 
is carried out to check if there are too many groups with an isolated picture, and 
whether some of these groups can be merged. See the "refinement" box in FIG. 1.

In order to reject Claim 1, the Examiner argues that the Loui
Article's "1st level cluster events by date/time" box of FIG. 1 would obviously be
replaced with the Simpson Article's RNNCCS classification, and that the "2nd
level cluster events by image content" box of FIG. 1 would obviously be replaced
with the Simpson Article's FFNN classification. See the Final Office Action,
page 2, paragraph #3. If Appellants understand the Examiner's position correctly,
the "2nd level" box of FIG. 1 is alleged to correspond to Claim 1's initial content-
based classification, and the "1st level" box is alleged to correspond to Claim 1's
pre-determined temporal context model. In this case, the Examiner concludes that
the "Refinement by split and merge" box of the Loui Article's FIG. 1 allegedly
corresponds to Claim 1's final image classification.

The Examiner has the burden of establishing a prima facie case of 
unpatentability. Appellants respectfully submit that this burden has not been met
and that one of ordinary skill in the art would not combine the teachings of the 
Simpson Article and the teachings of the Loui Article in the manner suggested in 
the Final Office Action. 

In particular, the teachings of the Simpson Article and the Loui 
Article are incompatible and teach away from each other. And, even if they were 
compatible, results would be unpredictable. 

A. The Simpson Article and the Loui Article Serve Different 
Functions: One Classifies Pixels and the Other Groups Whole Images 

As discussed earlier, the Simpson Article is understood to teach 
that its FFNN and RNNCCS classification processes each output a pixel-by-pixel
classification for each image input. See the Simpson Article, Section III-B, third
paragraph, and FIGS. 3(a) and 3(b). The Loui Article, on the other hand, is 
directed to grouping whole images together based on event similarities. In other 
words, the RNNCCS and FFNN processes serve the function of classifying pixels 
within an image, and the Loui Article's clustering serves a different function of 
grouping images. Accordingly, the Loui Article's event clustering is not suited to 
handle pixel-by-pixel classifications as its 1st and 2nd level clustering steps. Such
incompatibility would lead one of ordinary skill in the art away from combining 
the articles in the manner suggested. 

B. The Simpson Article's FFNN and RNNCCS Processes Do
Not Build From One Another, as Required by the Loui Article 

The Loui Article's 2nd level clustering of events by image content
is understood to build from initial event clusters generated by the 1st level
clustering by date/time. See the dotted line from the "1st level" box to the "2nd
level" box in FIG. 1 of the Loui Article. See also steps 2 and 3 at the upper-left of
page 2 of the Loui Article (presupposing that event boundaries and clusters
already exist when performing the 2nd level clustering). How or why would one
of ordinary skill in the art use the FFNN process of the Simpson Article to build
from output from the RNNCCS process? First, the RNNCCS process is not
understood to generate event clusters to be revised by a 2nd level clustering
according to the Loui Article. Second, even assuming for argument's sake that
the Loui Article were directed to pixel-wise classifications, the FFNN process is
described by the Simpson Article as using no feedback. See FIG. 3(a) of the
Simpson Article titled "No Feedback". FFNN is taught to be a stand-alone
process that does not build from prior classifications. See the Simpson Article,
Section III-B, first paragraph ("The . . . [FFNN] network was designed to classify
isolated scenes . . . ."). By definition, then, the FFNN process does not build from
prior output. Accordingly, it is respectfully submitted that one of ordinary skill in
the art would look away from using the FFNN process of the Simpson Article as
the Loui Article's 2nd level clustering.



C. No Benefit Exists for Using the RNNCCS and FFNN
Processes as the Loui Article's 1st and 2nd Level Event Clustering

No benefit appears to be gained by using the RNNCCS process as
the Loui Article's 1st level clustering and the FFNN process as the 2nd level
clustering. In particular, the Loui Article is described as operating on a plurality 
of pictures. See "input pictures" on the left-hand side of FIG. 1 of the Loui 
Article. If multiple pictures are available, why would the FFNN process, directed 
to isolated images, be used at all? The Simpson Article is understood to state that 
the RNNCCS process produces a better classification result than the FFNN 
process when a sequence of images is available. See the Simpson Article, Section 
III-B-(2), first paragraph. Consequently, the Simpson Article is understood to
teach that only the RNNCCS process would be used when a sequence of images is
available. See the Simpson Article, Section III-B-(2), first paragraph, and Section
III-B-(1), first paragraph. It appears that in the case of executing both the FFNN
process and the RNNCCS process on the same sequence of images, results of the 
FFNN process would be wasted, because they are redundant and inferior to the 
results of the RNNCCS process. Accordingly, Appellants respectfully submit that 
the Simpson Article teaches away from using both of the FFNN and RNNCCS 
processes on a sequence of images. 

D. The Loui Article's Refinement Step is Incompatible with
Outputs of the Simpson Article's FFNN and RNNCCS Processes 

The Loui Article describes that its refinement step "is carried out to 
check if there are too many groups with an isolated picture, and whether some of 
them can be merged." See step 6, upper-left of page 2, and the "refinement" box 
in FIG. 1 of the Loui Article. As far as can be imagined, if RNNCCS were used
as the 1st level clustering, and FFNN were used as the 2nd level clustering of the
Loui Article, the input to the refinement box in FIG. 1 would be a plurality of
images each having its pixels classified (e.g., clear, cloud, or snow). However, the
refinement step of the Loui Article is looking for groups of images having an 
isolated image. Consequently, the refinement step of the Loui Article seems 
incompatible with the outputs provided by the RNNCCS and FFNN processes. 
Why or how would one of ordinary skill in the art feed dueling pixel-by-pixel 
comparisons from RNNCCS and FFNN into the Loui Article's refinement step, 
which looks for groups of images having isolated images? Appellants respectfully 
submit that one of ordinary skill in the art would look away from the RNNCCS 
and FFNN processes of the Simpson Article for use as the 1st and 2nd level event
clustering of the Loui Article. 

Further in this regard, it appears that the Examiner may be 
inadvertently changing the function of the Loui Article's refinement step. The 
Examiner asserts that it would be obvious to have the FFNN and RNNCCS
"processes run in parallel in order to act as the two inputs to Loui's refined image
classification block . . . in order to combine the content image data and contextual
image time data into the refined event classification." See the Final Office
Action, page 3, top (citing the Loui Article, FIG. 1, Section I, paragraph 1, lines 
20-24). This citation to the Loui Article, however, is not directed to the function 
of the Loui Article's refinement step. The Loui Article explicitly states that its 
"refinement is carried out to check if there are too many groups with an isolated 
picture, and whether some of them can be merged." See the Loui Article, page 2, 
top-left, step 6. The Loui Article does not state that its refinement step combines 
content image data and contextual image time data into a refined event 
classification. Such a description of refinement apparently does not come from 
the Loui Article, the Simpson Article, or any other piece of prior art, but appears 
to come from the language of Claim 1 itself ("generating a final image 
classification for each image based at least on the respective initial content-based 
image classification and a pre-determined temporal context model ...."). 

E. Conclusion 

For at least these reasons, Appellants respectfully submit that the 
teachings of the Simpson Article and the Loui Article are incompatible and teach 
away from each other. And even if they were compatible, results would be 
unpredictable. Appellants respectfully submit that the Examiner's burden of 
establishing a prima facie case of unpatentability has not been met and 
respectfully request reversal of the rejection of Claim 1.



The other claims in this application depend directly or indirectly
from Claim 1. The combination of the Simpson Article and the Loui Article is
necessary to sustain the rejections of these other claims. Since such combination
is submitted to be improper at least for the reasons set forth above, reversal of the
rejections of these claims also is requested.

Appendix I - Claims on Appeal

1. (Previously Presented) A method for improving scene 
classification of a sequence of digital images comprising the steps of: 

(a) providing a sequence of images captured in temporal 
succession, at least two pairs of consecutive images in the sequence of images 
having different elapsed times between their capture; 

(b) classifying each of the images individually based on 
information contained in the individual image to generate an initial content-based 
image classification for each of the images; 

(c) generating a final image classification for each image based at 
least on the respective initial content-based image classification and a pre- 
determined temporal context model that considers at least the temporal succession 
of the sequence of images; and 

(d) storing the final image classifications in a computer readable 
storage medium. 

2. (Original) The method as claimed in claim 1 wherein the 
information used in step (b) includes pixel information.

3. (Original) The method as claimed in claim 1 wherein the 
information used in step (b) includes capture-device-generated metadata 
information. 

4. (Original) The method as claimed in claim 1 wherein the pre-
determined temporal context model in step (c) is independent of elapsed time 
between consecutive images. 



5. (Previously Presented) The method as claimed in claim 1 
wherein the pre-determined temporal context model in step (c) is dependent on 
elapsed time between consecutive images in the sequence.

6. (Original) The method as claimed in claim 1 wherein the pre-
determined temporal context model is a causal Hidden Markov Model dependent 
on a previous image. 

7. (Cancelled) 

8. (Original) The method as claimed in claim 1 wherein the pre- 
determined temporal context model is a non-causal model dependent on both a
previous image and a subsequent image. 

9. (Cancelled) 

10. (Original) The method as claimed in claim 1 wherein the 
temporal context model is imposed using Viterbi algorithm. 

11. (Original) The method as claimed in claim 1 wherein the
temporal context model is imposed using a belief propagation algorithm. 

12. (Previously Presented) The method as claimed in claim 1
wherein the pre-determined temporal context model in step (c) is dependent on 
elapsed time between consecutive images in the sequence, such that different 
elapsed times between a particular pair of consecutive images produces a different 
revised image classification for a later-captured image of the particular pair of 
consecutive images. 

Appendix II - Evidence

NONE 

Appendix III - Related Proceedings

NONE 